

BROWN UNIVERSITY
Department of Computer Science

DTIC ELECTE JUN 9 1988


Maildriver:  A  Distributed  Campus-wide  Mail  System 

by Rich Yampell


ABSTRACT 

Maildriver  is  a  distributed  system  for  a  future  campus-wide 
computer  mail  service.  The  goal  of  this  paper  is  to  set  forth  the 
design  aspects  of  Maildriver  insofar  as  they  relate  to  a  distributed 
system.  Thus,  there  are  aspects  of  Maildriver  which  are  not 
covered  herein,  but  which  are  presumably  part  of  separate  research 
(such  as  the  mapping  of  user  names,  user  authentication,  details  of 
networking, etc.). This paper is divided into four parts: Part I
discusses Maildriver from a user's point of view. Part II is concerned
with implementation details of the various Maildriver functions.
Part III deals with robustness and crash recovery. Part IV
concludes. In addition, there is an Appendix which contains a
pseudo-code  listing,  based  on  the  C  language,  of  some  of  the  critical 
Maildriver  processes  (specifically,  the  master  processes). 

Maildriver: A Distributed Campus-wide Mail System

by  Rich  Yampell 


Part  I.  A  User’s  View  of  Maildriver 


1.  Introduction 

Maildriver is a campus-wide computer mail system. It allows any member of
the campus community with any kind of computer access (e.g., a personal
computer) to exchange computer mail with other such people in the campus
community. Maildriver users connect to any one of a set of central,
dedicated Maildriver machines to either send or receive messages. Because
Maildriver is a distributed system, all services are available to the user,
regardless of the Maildriver machine to which he is connected.

Maildriver is an acronym for M-aildriver A-bsolutely I-nsures L-etter
D-elivery. R-ich I-s V-ery E-xcited. R-eally.

1.1.  Environment  of  Maildriver 

Maildriver is assumed to run on a set of dedicated Maildriver machines, all of
which are networked such that any given machine may communicate directly
with any other. Rough initial guesstimating suggests three such machines as a
starting point, although the Maildriver design allows for the addition of more.
In any case, no hard assumptions are made concerning the number of
Maildriver machines.

Maildriver also assumes facilities for users to connect to Maildriver machines,
either through phone lines or directly. Client computers which communicate
with Maildriver computers are presumed to have front-end programs which
allow them to communicate with Maildriver. Such front-ends would, of course,
have to be tailored for their target machines. Thus, if user John Doe uses his
Macintosh to connect to Maildriver, he would have a front-end program
tailored for the Macintosh which knows how to communicate with Maildriver.
The design of such client front-end programs seems straightforward, and is
outside the scope of this paper.

1.2.  Clients  of  Maildriver 

To be a client of the Maildriver system, one must fulfill three criteria. First,
one must have access to a computer which can physically access Maildriver.
Second, one must have a front-end program on that computer which can
communicate with Maildriver. Third, one must be a registered Maildriver user.
It is assumed that there will be some administrative organization to oversee the
functioning of Maildriver; a user would become registered through this
organization.

In addition, for certain privileged members of the administrative organization,
there are special administrative functions available which may not be run by
normal clients.


2.  Design  Goals 

The chief goal in designing Maildriver was to create a robust, distributed
system in which a user would always have access to the full functionality of the
system. In particular, the user should always be able to read all of his mail. An
additional goal was to deal reasonably efficiently with bulk mail. It was
recognized that in a University environment, there would likely be a large
amount of bulk mailings and junk mail (e.g., notices to all undergraduates,
memos to all faculty, etc.). Since the number of recipients of such mail could
conceivably be quite large, Maildriver tries to make such mailing as efficient as
possible.

3.  Functionality 

The  Maildriver  system  makes  the  following  functions  available  to  clients  of  the 
system: 

send letter      -- a letter, already prepared on the client's
                    machine via some type of editor, is
                    deposited with the Maildriver system for
                    delivery.

receive mail     -- Maildriver fetches all accumulated mail
                    for the user and sends it over to the
                    client machine.

delete letters   -- Maildriver deletes the specified letters
                    from the user's mailbox.

list letters     -- fetches a quick list of pending mail for
                    the user. This is intended to roughly
                    implement the Unix (tm) "from" command.

message-polling  -- a quick check is made to see if the user
                    has any pending mail.

The "receive mail" function sends all letters for the user to his client machine.
That is, it finds any mail, anywhere in the system, for the user. When the mail is
sent over, it is not deleted. Mail only goes away when specifically requested to
via the "delete letters" function. Thus, the typical sequence would be for the
client front-end to request the user's mail, and then request deletion of all letters
the user does not want preserved.

The "message-polling" function is intended to be used by client front-end
programs running in the background to periodically check if the user has mail.
Since it will necessarily be run frequently and by many users, it is designed to
run as quickly as possible. Thus, only a "quick" check is made for mail; that is,
the Maildriver machine to which the user is connected only checks its own
storage to see if there is mail, rather than polling the entire Maildriver system.
While this is only a heuristic, it will be accurate most of the time. This "quick"
checking approach also applies to the "list letters" function.

In general, Maildriver functions send back an acknowledgement when the task
is completed. If a client receives an acknowledgement, it may proceed with the
certain knowledge that the task has been or will be fulfilled. If no
acknowledgement is forthcoming, something may be awry with that particular
Maildriver machine (e.g., it has crashed), and the client would then simply
connect to another Maildriver machine and repeat the request. Note that no
damage will be done by a partially completed request.
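The retry rule can be sketched from the client's side. This is only an illustration under stated assumptions, since the paper leaves the communication protocol out of scope: the `send_request` hook, the machine names, and the convention that a missing acknowledgement shows up as `None` or as a `ConnectionError` are all hypothetical.

```python
from typing import Callable, Iterable, Optional


def request_with_failover(
    machines: Iterable[str],
    request: str,
    send_request: Callable[[str, str], Optional[str]],
) -> str:
    """Try each Maildriver machine in turn until one acknowledges.

    send_request(machine, request) is a hypothetical transport hook: it
    returns the acknowledgement text, returns None if no acknowledgement
    arrives, or raises ConnectionError if the machine is unreachable.
    Returns the name of the machine that acknowledged.
    """
    for machine in machines:
        try:
            ack = send_request(machine, request)
        except ConnectionError:
            continue  # machine appears to be down; try the next one
        if ack is not None:
            return machine  # acknowledged: the task has been or will be done
    raise RuntimeError("no Maildriver machine acknowledged the request")
```

Repeating a request on another machine is safe only because, as noted above, a partially completed request does no damage.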

In  addition  to  these  general  functions,  the  following  administrative  functions  are 
available  to  qualified  administrators: 

add user         -- a new user is added to the system

delete user      -- an old user is removed from the system

change master    -- forces a specified machine to become
                    the "Master". More on this later.


4.  A  Sample  Session 

Let  us  consider  a  sample  session  of  user  interaction  with  Maildriver.  Take  the 
case  of  our  good  friend  John  Doe  and  his  trusty  Brand  X  personal  computer. 

Johnny wants to write a letter to his girlfriend Pauline, who he knows is on the
Maildriver system. He fires up his trusty Brand X (which includes starting up a
daemon called XCLOCK, which periodically polls Maildriver for mail via the
"message-polling" function) and gets into XTEXT, the Brand X text editor.
Using XTEXT, he composes his letter to Pauline. While he is editing the letter,
the daemon finds out that he in fact has mail waiting for him and lights up a
MAIL light on his screen. John finishes the letter, saves it, and enters XMAIL,
the front-end program to Maildriver for the Brand X. He directs XMAIL to send
the letter to Pauline. XMAIL gets the letter, connects to Maildriver, and sends
the letter via the "send letter" function. Maildriver sends back an
acknowledgement that it got the letter, and XMAIL tells John that everything is
A-OK. Then John directs XMAIL to get his mail for him. XMAIL sends
Maildriver the "receive mail" directive and Maildriver sends back all the letters
waiting for John. XMAIL saves the letters as they come in and presents them to
John in a user-friendly way (Brand X computers are notoriously user-friendly). It
turns out that John got three letters. John reads them all, and decides that he
wants to preserve one of them within Maildriver and deal with it later; of this he
informs XMAIL. To comply with this request, XMAIL sends a "delete letters"
request to Maildriver for the two letters that John does not want preserved.
Again, Maildriver sends back an acknowledgement. John is finished, and
XMAIL breaks connection.


Note  that  the  specifics  of  communication  protocol  are  outside  the  scope  of  this 
paper  and  are  not  described  here. 


Part II. Implementation Details (or, "A look under the hood")

5.  Overview  of  System  Internals 

In order to meet the design goal of providing full service to all users at all times,
Maildriver is a fully replicated system. This means that all services are
available through any Maildriver machine, and in particular, user mail is
replicated onto each Maildriver machine. Thus, when a given letter is deposited
into the Maildriver system, each Maildriver machine gets a copy of that letter.
Similarly, when a letter is deleted, it must be deleted from every machine in the
system.

To coordinate all this replication, Maildriver designates one of its machines to
be the "Master". The Master machine functions just like any of the other
machines, except that it runs an additional process, the Master process, which
maintains open communications channels with all the other machines.

The  job  of  the  Master  is  two-fold.  First,  it  directs  and  keeps  track  of  the  state  of 
letter  replication.  Second,  it  keeps  track  of  which  Maildriver  machines  are  up 
and  which  are  down;  for  each  down  machine,  it  keeps  a  list  of  what  actions  that 
machine  will  need  to  perform  when  it  comes  back  up  in  order  to  catch  up  to  the 
rest  of  the  world. 

In addition to the Master, one machine is designated the Lieutenant. The job of
the Lieutenant is simply to keep a duplicate copy of the Master's information; in
the event that the Master crashes, the Lieutenant takes over and becomes
Master.

The  initial  appointments  of  Master  and  Lieutenant  are  specified  in  startup  files 
which  are  read  when  the  system  first  comes  up. 

Each machine maintains a set of special directories. The most important of these
is the Workspace directory. This directory is used to process the delivery of
mail. In addition to the Workspace directory, each machine has a Master
directory, a Cemetery directory, and a Lieutenant directory. Only the current
Master and the current Lieutenant, respectively, actually use these directories
(but since any machine may at some point attain either of these ranks, all
machines have these directories set up in advance).

Aside from the special directories, each machine maintains a directory for each
Maildriver user. These directories are known as the user InBoxes. Each file in
an InBox directory is a discrete letter.

Every machine has running on it a process known as the Server process. The job
of the Server process is to wait for a user to make connection, and then fork off a
copy of itself to process the user's requests. The Server process is then,
essentially, the "front-end" of Maildriver. It is the only part of the system that
clients actually see.

Each  machine  also  has  a  back-end  process  which  actually  handles  message 
delivery,  called  the  Delivery  process.  The  Delivery  process  takes  its  orders  from 
the  Master,  which  generally  consist  of  things  like  sending  and  deleting  letters. 

Finally, each machine runs what is called a Pulse process. The sole job of the
Pulse process is to let the Master know that that machine is still up and running.
The Pulse process normally sits, dormant, until a polling message comes in from
the Master, to which it sends a simple response indicating life. Thus, the idea is
that the Master takes the "pulse" of a given machine.


6. The Structure of the Master

The  job  of  the  Master  is  complex  and  multifaceted.  In  order  to  keep  things 
organized  and  running  smoothly,  the  Master  process  forks  off  several  auxiliary 
processes  to  perform  specific  tasks.  Thus,  the  Master  machine  actually  runs  a  set 
of  master  processes;  of  these,  the  parent  and  chief  is  the  Master  process. 

The most important of these is called the Caretaker. The Caretaker is
responsible for issues of life and death. It regularly polls the Pulse processes of
the sundry machines to verify that they are still alive. In this manner, it can
determine when a machine has died and can take appropriate action. This will be
discussed in detail in section 10. (Of course, it is useless for the Caretaker to
check the Pulse of its own machine; hence, the Lieutenant is responsible for
polling the Master's Pulse process.)
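One round of the Caretaker's polling might look like the sketch below. The `take_pulse` hook and the `max_missed` tolerance are assumptions made for illustration; this part of the paper does not say how many unanswered polls count as a crash, nor what action follows.

```python
from typing import Callable, Dict, Iterable, Set


def poll_pulses(
    machines: Iterable[str],
    take_pulse: Callable[[str], bool],
    missed: Dict[str, int],
    max_missed: int = 3,
) -> Set[str]:
    """Poll every machine once and return those now presumed dead.

    take_pulse(machine) is a hypothetical hook returning True if that
    machine's Pulse process answered. missed carries the count of
    consecutive unanswered polls from one round to the next.
    """
    dead = set()
    for machine in machines:
        if take_pulse(machine):
            missed[machine] = 0  # any answer resets the count
        else:
            missed[machine] = missed.get(machine, 0) + 1
            if missed[machine] >= max_missed:
                dead.add(machine)
    return dead
```

The Caretaker would pass in every machine except its own, since the Master's Pulse is polled by the Lieutenant.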

The other processes created by the Master process are called Sender processes.
One Sender process is created for each Maildriver machine in the system. The
job of the Sender is straightforward: it accepts messages for its machine, queues
them up, and sends them over when appropriate. In this way, both the Master
and Caretaker processes can send messages to the sundry machines in an orderly
fashion.
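A Sender can be pictured as a first-in, first-out queue in front of one machine. The sketch below assumes a hypothetical `transmit` hook and assumes that "when appropriate" means "whenever the machine accepts the message at the head of the queue"; the paper does not define either.

```python
from collections import deque
from typing import Callable, Deque


class Sender:
    """Queues messages for one machine and forwards them in order."""

    def __init__(self, machine: str, transmit: Callable[[str, str], bool]):
        # transmit(machine, message) is a hypothetical hook that returns
        # True once the message has been handed to the remote machine.
        self.machine = machine
        self.transmit = transmit
        self.queue: Deque[str] = deque()

    def accept(self, message: str) -> None:
        """Called by the Master or the Caretaker to enqueue a message."""
        self.queue.append(message)

    def flush(self) -> int:
        """Send queued messages in FIFO order; stop at the first failure."""
        sent = 0
        while self.queue and self.transmit(self.machine, self.queue[0]):
            self.queue.popleft()
            sent += 1
        return sent
```

Because a message leaves the queue only after it has been transmitted, messages for an unreachable machine stay queued in their original order.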

The setup, then, is as follows. Both the Master and the Caretaker can send
messages to any machine via the Sender processes. The Sender processes are
connected to the Delivery and Server processes of their respective machines. The
Master and Caretaker can also communicate directly with each other. In
addition, the Caretaker has separate network connections to the Pulse processes
of each machine. Finally, the Master has a separate connection to the Lieutenant
process on the Lieutenant machine (and the Lieutenant has another separate
connection to the Master's Pulse).

Since  all  communication  out  from  the  Master  goes  out  through  the  Sender 
processes,  they  will  not  be  discussed  further;  that  is,  we  will  in  general  refer  to 
the  Master  sending  a  message  to  a  machine  with  the  implicit  understanding  that 
in  fact  the  Sender  acts  as  intermediary. 

7. Message Acceptance: beneath the "send letter" function

7.1. The Job of the Server Process

When a user wishes to send a letter via the Maildriver system, he connects to a
Maildriver machine (which is to say, he connects with the Server process for
that machine; the Server process then forks off a copy of itself to service the user)
and deposits his letter via the "send letter" function.

When the Server process gets the "send letter" request, it sets about creating a
thing known as a "Protoletter". First, it creates a unique filename of the form
<local machine name>.<time stamp, given that machine's view of time>.
Thus, a filename might look something like MD1.4030382. It then creates a file
by this name in the Workspace directory and writes a one-page (that is, disk
page) header containing the list of recipients of the letter [note: mail aliases are
expanded at this time through a process outside the scope of this paper] and a
State flag, which in general may be set to either "valid" or "bogus"; it sets it now
to "bogus". It then reads the body of the letter from the client and appends it
onto the Protoletter. [Note that the header is a full disk page long, regardless of
how much of it is actually used. This is because writing one disk page is
considered an atomic operation.]
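The Protoletter layout can be made concrete with a short sketch. Everything about the encoding here is an assumption: the paper specifies only that the header occupies one disk page and holds the recipient list and the State flag, so the 512-byte `PAGE_SIZE`, the line-per-field text format, and the NUL padding are illustrative choices.

```python
import os
import time

PAGE_SIZE = 512  # assumed disk page size; the paper gives no figure


def build_header(recipients, state):
    """Return a header padded to exactly one disk page.

    Assumed layout: a state line, then one recipient per line, then NUL
    padding out to PAGE_SIZE bytes.
    """
    if state not in ("bogus", "valid"):
        raise ValueError("state must be 'bogus' or 'valid'")
    raw = ("\n".join([state] + list(recipients)) + "\n").encode("ascii")
    if len(raw) > PAGE_SIZE:
        raise ValueError("recipient list does not fit in one page")
    return raw.ljust(PAGE_SIZE, b"\0")


def parse_header(page):
    """Inverse of build_header: return (state, recipients)."""
    lines = page.rstrip(b"\0").decode("ascii").splitlines()
    return lines[0], lines[1:]


def create_protoletter(workspace, machine, recipients, body, now=None):
    """Write a new Protoletter, initially marked bogus, and return its path."""
    stamp = int(time.time() if now is None else now)
    path = os.path.join(workspace, "%s.%d" % (machine, stamp))
    with open(path, "wb") as f:
        f.write(build_header(recipients, "bogus"))  # header page first
        f.write(body)                               # then the letter body
    return path
```

Keeping the header at a fixed size means that a later change to the State flag or the recipient list rewrites exactly one page and never moves the body.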

Once the Server process has the Protoletter set up, it notifies the Master and
sends it the Protoletter. The Master takes the Protoletter and puts it in its Master
directory (using the original unique filename, of course). When it has it, it sends
the filename to the Lieutenant (not the file, just the filename). When the
Lieutenant acknowledges, the Master rewrites the header, changing the State field
to "valid", and sends acknowledgement back to the original machine [note: the
original machine may well have been the Master machine itself. Remember that
the Master machine functions just the same as any other machine, except that it
has the Master processes on it]. When the original machine gets
acknowledgement from the Master, it in turn sends out acknowledgement to its
client. Thus, to insure letter delivery, what is really necessary is to get a valid
copy out to the Master. Note that recipient parsing is done on the original
machine to avoid overloading the Master machine.

Now  the  client  can  go  his  merry  way,  and  the  forked  Server  process  is  free  to  die 
(if  the  client  has  no  further  requests).  The  Master  picks  up  the  ball  from  here. 


7.2.  The  Job  of  the  Master 

The Master now sends a copy of the Protoletter to each Maildriver machine's
Delivery process (including its own, of course). Each Delivery process then copies
the Protoletter into the Workspace for its machine and proceeds to process the
delivery of that letter. Actually, there is one exception to this procedure: the
Master does not send the whole Protoletter to the machine which originally
received the letter (since that machine already has a copy of it); it just sends it a
special message which in human terms translates to "you already have the
letter, just mark it 'valid' and process it" [note that the Master doesn't even need
to remember which machine originally received the letter, since the machine
name is part of the filename].

Each Delivery process receives the Protoletter from the Master and sends back an
acknowledgement. It then processes the Protoletter in the manner described
below, and then sends another acknowledgement to the Master [thus, the first
acknowledgement means "I got it", and the second means "I've finished it"]. The
Master, meanwhile, keeps track of which machines have acknowledged and which
have not. When all machines have acknowledged that they're finished, the
Master can discard these records, as well as the copy of the Protoletter in its
Master directory (remember that by now it has a copy in its Workspace).
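The Master's bookkeeping for the two acknowledgements can be sketched as a small record per letter. The class and method names are hypothetical; the paper says only that the Master tracks which machines have acknowledged and discards the record once all have finished.

```python
from typing import Iterable, Set


class ReplicationRecord:
    """Master-side bookkeeping for one letter being replicated."""

    def __init__(self, filename: str, machines: Iterable[str]):
        self.filename = filename
        # Machines that have not yet sent the first ("I got it") ack.
        self.awaiting_receipt: Set[str] = set(machines)
        # Machines that have not yet sent the second ("I've finished it") ack.
        self.awaiting_finish: Set[str] = set(machines)

    def received(self, machine: str) -> None:
        """Record the first acknowledgement from a machine."""
        self.awaiting_receipt.discard(machine)

    def finished(self, machine: str) -> bool:
        """Record the second acknowledgement; True when all are done."""
        self.awaiting_receipt.discard(machine)
        self.awaiting_finish.discard(machine)
        return not self.awaiting_finish
```

When `finished` returns True, the Master can drop the record and remove the Protoletter from its Master directory.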

7.3.  The  Job  of  the  Delivery  Process 

The Delivery process must now perform the actual delivery of the letter, using
the Protoletter. What it does is read the header of the Protoletter, which
contains the recipient list for that letter. Then, for each recipient, it puts a link to
the Protoletter into the InBox directory for that user. It then rewrites the header
for the Protoletter to no longer contain that user in the recipient list. It proceeds
this way until the recipient list is empty, at which point it can delete the
Protoletter and send final acknowledgement to the Master.

If at any point in delivery the Delivery process tries to deliver a letter which is
already there, it ignores the new one in favor of the old one. More specifically, if
the call to "link" fails, and the error number shows the error to be an attempt to
create a link where a file already exists, the Delivery process simply continues,
pretending there was no error (as opposed to taking some kind of emergency
action for an unexpectedly failed syscall). This should virtually never happen,
but there may be some very weird extreme circumstances where it might. This
rule of thumb works around all such instances.

Note the key idea here is that the Delivery process makes links, not copies, of the
Protoletter. Thus, there is only one copy of a letter per machine, regardless of
the number of recipients of the letter. This is the vital step towards realizing our
goal vis-a-vis bulk mail. Since letters never get edited or changed in any way
once delivered, there is no reason not to do it with links. Note also that this
decision to use links is what leads to the one letter per file mail structure (as
opposed to one long file as the user's inbox, a la Unix (tm)).
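The delivery loop described in this section can be sketched with hard links. The `save_recipients` hook (which stands for rewriting the header page) and the directory layout under `inbox_root` are assumptions; `os.link` raising `FileExistsError` corresponds to the "file already exists" error number mentioned above.

```python
import os


def deliver(protoletter, recipients, inbox_root, save_recipients):
    """Link one Protoletter into each recipient's InBox, then remove it.

    save_recipients(remaining) is a hypothetical hook that rewrites the
    header page with the shortened recipient list. inbox_root holds one
    directory per user, assumed to exist already.
    """
    name = os.path.basename(protoletter)
    remaining = list(recipients)
    while remaining:
        user = remaining[0]
        try:
            os.link(protoletter, os.path.join(inbox_root, user, name))
        except FileExistsError:
            pass  # letter already there: keep the old one and carry on
        remaining.pop(0)
        save_recipients(list(remaining))  # user is no longer a recipient
    # Removing the Workspace name leaves the InBox links intact, so the
    # machine still holds a single copy of the letter's data.
    os.unlink(protoletter)
```

Any other failure of `os.link` is deliberately left to propagate, matching the distinction the text draws between the tolerated error and an unexpectedly failed syscall.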


7.4.  The  Job  of  the  Lieutenant 

During letter delivery, it is important that the Lieutenant keep track of what's
going on. Just as the Master machine is just a normal machine with a Master
process running on it, so the Lieutenant machine is a normal machine with a
Lieutenant process on it. The Lieutenant process sits waiting to hear from the
Master (and occasionally polling it to make sure it's still alive). When the Master
first gets a new letter, it sends the filename to the Lieutenant, who records the
filename for future reference. In this way it knows that there is a letter in
mid-delivery. When all Delivery processes have acknowledged back to the
Master, the Master sends the Lieutenant a special acknowledgement which
means, essentially, "it's OK, that letter got sent OK, you can forget about it", and
the Lieutenant deletes that send request from its records. All this will become
relevant in section 11.


8. Message Retrieval: beneath the "receive mail" function


When a user uses the "receive mail" function to get his mail, it is our goal to
make sure he gets all of his mail, even if some of it is only half delivered. When
the client machine calls up a Maildriver machine, gets connected to a Server
process, and makes a "receive mail" request, the Server process goes through a
number of steps to get that client's mail.

First, the Server process sends a message to the Master, a message which in
English would translate roughly to "Hey, this guy here wants his mail. If you've
got anything for him, send it right over." The Master then scans all of its
recently received letters, looking for any which have not yet been replicated to
the Server's machine and for which the user in question is, in fact, a recipient. If
it finds any such letters, it immediately sends them to the Delivery process in the
normal manner. These letters, then, will show up presently as Protoletters on
the desired machine; all that really happens is that the user's mail is moved to
the front of the line, as it were.

Next,  the  Server  process  examines  each  Protoletter  in  the  Workspace.  For  each 
Protoletter,  it  reads  the  header  page  and  looks  to  see  if  its  user  is  on  the  recipient 
list.  There  are  actually  four  possible  cases,  with  corresponding  actions  taken. 

If the user is not in the list, it can ignore this Protoletter and go on to the next
one.

If  the  user  is  in  the  list,  but  the  State  flag  is  marked  "bogus",  then  the  letter  is 
not  done  being  replicated  (it  is  probably  one  being  sent  over  by  the  Master  on 
rush  order,  as  a  result  of  the  first  step),  and  what  we  want  to  do  is  wait  on  it. 
The  Server  process  puts  the  name  of  this  Protoletter  on  a  special  list  called  the 
RTCBACIL  list  (the  "Remember  To  Come  Back  And  Check  It  Later"  list)  and 
goes  on  to  the  next  Protoletter. 

If the user is the first person in the list (and the State is "valid"), it means that
there is a Delivery process running right now delivering the letter to the user, so
the thing to do is just wait. Again the name is put on the RTCBACIL list and
scanning continues.


If the user is on the recipient list (and the State is "valid"), but is not the first
person, then the Server process locks the Protoletter file, and proceeds to deliver
the letter itself, proceeding just as the Delivery process would. Since the file is
locked, the Delivery process will have to wait before it can start delivery to a new
user (and thus we avoid conflicts of two processes trying to write to the same
file, whether the Protoletter file or the actual letter file). After the Server process
has delivered the letter and rewritten the Protoletter header, it unlocks the file
and proceeds to the next Protoletter.
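The four cases reduce to a small decision function. This is a sketch only: the returned action names are illustrative labels, not terms from the paper, and the header is assumed to have been parsed already into a state string and an ordered recipient list.

```python
def classify(user, state, recipients):
    """Decide what the Server does with one Protoletter for `user`.

    The four cases map onto three actions: "ignore", "wait" (put the
    name on the RTCBACIL list), or "deliver" (lock the file and deliver
    the letter from the Server process itself).
    """
    if user not in recipients:
        return "ignore"   # case 1: not a recipient
    if state == "bogus":
        return "wait"     # case 2: still being replicated
    if recipients[0] == user:
        return "wait"     # case 3: the Delivery process is on this user now
    return "deliver"      # case 4: valid, and the user is further down
```

The order of the tests matters: the State flag is checked before the position in the list, so a bogus Protoletter is never delivered by the Server.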

Finally, the Server process must process the RTCBACIL list. To do this, it
repeatedly scans the list, taking the same actions as when it scanned the
Workspace initially, except of course that instead of putting something on the
RTCBACIL list, it just leaves it there, and when the user isn't on the recipient
list, that name is removed from the RTCBACIL list (a user would cease to be a
recipient if A) the Delivery process has already delivered it or B) the Server
process delivered it itself).
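The repeated scan of the RTCBACIL list might be sketched as follows. The `read_header` and `deliver_for_user` hooks and the `pause` between passes are assumptions; `deliver_for_user` stands for the lock, link, rewrite-header, unlock sequence, and `read_header` is assumed to report an empty recipient list for a Protoletter that has already been deleted.

```python
import time


def drain_rtcbacil(names, user, read_header, deliver_for_user, pause=0.0):
    """Rescan the RTCBACIL list until the user is on no recipient list.

    read_header(name) returns (state, recipients) for a Protoletter;
    deliver_for_user(name, user) delivers the letter to this one user
    and removes the user from the recipient list. Both are hypothetical.
    """
    pending = list(names)
    while pending:
        still_waiting = []
        for name in pending:
            state, recipients = read_header(name)
            if user not in recipients:
                continue  # already delivered: remove the name from the list
            if state == "valid" and recipients[0] != user:
                deliver_for_user(name, user)
            # Bogus, first in the list, or just delivered: leave the name
            # on the list; a later pass confirms the user is gone.
            still_waiting.append(name)
        pending = still_waiting
        if pending:
            time.sleep(pause)  # assumed delay before the next pass
```

The loop ends only when every name has dropped off the list, which is the condition under which all of the user's mail is in his InBox.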

At this point, all the mail there is for the user has wound up in his InBox. The
Server process may now send all the letters out to the client program.

9.  A  Look  at  Other  Functions 


9.1.  The  “delete  letters”  function 

Deleting a letter is not something that needs to happen particularly quickly.
That is, a user is not sitting around waiting for immediate results of a deletion.
This fact is made use of in the algorithm to delete a letter.

When a client program gives a "delete letters" directive, the Server process
processes each letter specified in the delete list (that is, each letter the user
wants deleted) in the following manner. First, it verifies that the letter does in
fact exist in the user's InBox directory (no point in deleting a letter he doesn't
actually have). Then it sends the request to the Master, specifying the user and
letter names; the Master in turn sends the request to the Lieutenant. The
Lieutenant gets the request, marks it down, and sends back acknowledgement to
the Master. Now that the Master and the Lieutenant have made note of what
letters need to be deleted, the Master sends back acknowledgement to the Server
process, which in turn sends acknowledgement to the client program. The user
may go about his business.

Now, the Master examines its view of the world. For each letter in the delete
list, it looks to see if that letter is in mid-delivery. If not, deletion may proceed
in a straightforward fashion. The Master sends out delete requests to the
Delivery processes of the sundry Maildriver machines. When each machine has
acknowledged completion of the deletion, the Master sends word to the
Lieutenant, who now ceases to remember the deletion request.

If the letter is in mid-delivery, then something more elaborate is required in order
to avoid the case of a machine attempting to delete a letter which it has not yet
even received. Consider, for example, the story of John and Pauline using a
Maildriver system configured for four machines, A, B, C, and D. Suppose that
Pauline connects to machine A and invokes the "send letter" function to send a
letter to John. Replication and delivery of the letter now proceeds as described
in section 7. Meanwhile John connects to machine B and asks for his mail via
the "receive mail" function. As it happens, B has already managed to deliver its
copy of Pauline's letter to John's InBox, in time for John to receive it; John gets
it and reads it right away. Now, having read it, he wishes to delete it, and issues
an appropriate "delete letters" directive. However, despite the fact that Pauline's
letter has already been successfully delivered to machines A, B, and D, it seems
that machine C is still in the process of delivering it. Things could get real sticky
if the Master were to send out the deletion request right at this moment, because
C could find itself attempting to delete an as-yet undelivered letter. This is
precisely the situation we wish to avoid.

The way around this problem is to simply defer deletion until delivery is
completed. Thus, if the Master sees that the letter to be deleted is in mid-delivery, it
puts the letter on a queue (the Delete queue), and forgets about it. When a
delivery completes, the Master checks the Delete queue; if it finds the
just-completed letter there, it performs the deletion as above. While this algorithm
may not be the most efficient time-wise, it is clearly the simplest and cleanest.
Moreover, as already noted, speed is not a prime concern in letter deletion.
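
The deferral rule can be illustrated with a small sketch in C. It is not the paper's implementation: the in-memory NameSet type, the Master struct, and the function names begin_delivery, request_delete and delivery_completed are all hypothetical stand-ins for the Master directory and the Delete queue, and a counter stands in for actually sending deletion requests.

```c
#include <assert.h>
#include <string.h>

#define MAX_LETTERS 64
#define NAME_LEN    64

/* Hypothetical in-memory set of letter filenames. */
typedef struct {
    char names[MAX_LETTERS][NAME_LEN];
    int  count;
} NameSet;

static int set_find(const NameSet *s, const char *name) {
    for (int i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0) return i;
    return -1;
}

static int set_add(NameSet *s, const char *name) {
    assert(strlen(name) < NAME_LEN);
    if (set_find(s, name) >= 0) return 0;      /* already present */
    if (s->count == MAX_LETTERS) return -1;    /* no room */
    strcpy(s->names[s->count++], name);
    return 0;
}

static void set_remove(NameSet *s, const char *name) {
    int i = set_find(s, name);
    if (i < 0) return;
    s->count--;
    if (i != s->count)
        strcpy(s->names[i], s->names[s->count]); /* move last entry into the gap */
}

/* Hypothetical stand-in for the Master's bookkeeping. */
typedef struct {
    NameSet in_delivery;      /* letters still being delivered */
    NameSet delete_queue;     /* deletions deferred until delivery completes */
    int     deletions_issued; /* counts deletion broadcasts, for illustration */
} Master;

void begin_delivery(Master *m, const char *letter) {
    set_add(&m->in_delivery, letter);
}

/* Returns 1 if the deletion was issued at once, 0 if it was deferred. */
int request_delete(Master *m, const char *letter) {
    if (set_find(&m->in_delivery, letter) >= 0) {
        set_add(&m->delete_queue, letter);
        return 0;
    }
    m->deletions_issued++;
    return 1;
}

/* Called once every machine has acknowledged delivery of the letter. */
void delivery_completed(Master *m, const char *letter) {
    set_remove(&m->in_delivery, letter);
    if (set_find(&m->delete_queue, letter) >= 0) {
        set_remove(&m->delete_queue, letter);
        m->deletions_issued++;   /* the deferred deletion goes out now */
    }
}
```

In the John and Pauline story, John's request would return 0 (deferred) while machine C is still delivering, and the deletion would go out only when delivery_completed runs for that letter.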


9.2.  The  "list  letters"  and  "message-polling"  functions 

These two functions are very similar in nature and are both handled in a trivial
fashion for efficiency reasons. In either case, the Server process simply looks at
the InBox directory for the user on its local Maildriver machine. For
"message-polling", it returns true or false depending on whether or not there is
mail waiting in the user's InBox. For "list letters", it returns the list of what
letters exist in the InBox.

Again,  note  that  no  attempt  is  made  to  garner  information  from  the  rest  of  the 
Maildriver  system.  For  "message-polling",  the  reason  is  that  message  polling  is 
likely  to  happen  so  frequently  that  it  is  desirable  to  have  it  be  as  small  a  drain  on 
the  Maildriver  system  as  possible.  The  "list  letters"  function,  on  the  other 
hand,  is  likely  to  be  invoked  rather  rarely,  so  much  so  that  it  hardly  seems  worth 
the  effort  to  race  the  user's  mail  through  just  to  get  it  into  the  list. 


9.3.  The  administrative  functions 

In general, for all administrative functions, the Server process just passes the
request along to the Master [the Server first verifies that the user is privileged to
perform an administrative action, but how this is done is outside the scope of this
paper].

The "add user" and "delete user" administrative functions are really very trivial
to  do.  Again,  the  Master  first  off  informs  the  Lieutenant  of  the  request.  Then 
the  user  is  added/deleted  from  the  system  namespace  somehow  [again,  outside 
the  scope  of  this  paper],  and  requests  to  add/delete  the  InBox  for  that  user  are 
sent  out.  When  the  Master  gets  back  all  the  acknowledgements,  the  Lieutenant 
is  informed  that  all  went  well. 
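
The acknowledgement bookkeeping used here (and for sends and deletes) can be sketched as follows. This is only an illustration under stated assumptions: the Pending type and the functions pending_start and pending_ack are hypothetical, and a bitmask in memory stands in for the pending file that the Master keeps in its directory.

```c
#include <assert.h>

#define MAX_MACHINES 32

/* One outstanding request. A bit is set for every machine that still
   owes an acknowledgement (the role played by the pending file). */
typedef struct {
    unsigned long awaiting;
} Pending;

/* Mark machines 0..n_machines-1 as owing an acknowledgement. */
void pending_start(Pending *p, int n_machines) {
    assert(n_machines > 0 && n_machines <= MAX_MACHINES);
    p->awaiting = 0;
    for (int m = 0; m < n_machines; m++)
        p->awaiting |= 1UL << m;
}

/* Record an acknowledgement from one machine. Returns 1 once every
   machine has answered, at which point the Lieutenant can be told to
   forget the request. A repeated acknowledgement is harmless. */
int pending_ack(Pending *p, int machine) {
    assert(machine >= 0 && machine < MAX_MACHINES);
    p->awaiting &= ~(1UL << machine);
    return p->awaiting == 0;
}
```

Because clearing an already-cleared bit changes nothing, a duplicated acknowledgement cannot complete a request early or corrupt the record.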

The "change master" function would likely get very little use. Its only purpose is
to  allow  the  administrators  to  force  a  specific  Maildriver  machine  to  be  the 
Master.  This  would  only  happen  if,  for  instance,  one  of  the  Maildriver 
machines  was  more  powerful  than  the  others  and  therefore  ought  to  carry  the 
extra  burden  of  being  Master.  If  this  were  the  case,  though,  it  would  normally  be 
appointed  the  Master  anyway  by  the  default  startup  routines,  so  this  function 
would  only  be  useful  after  the  more-powerful  machine  had  crashed  (whereupon 
another  machine  became  master)  and  was  now  up  again. 

To  implement  the  function,  the  Server  process,  again,  sends  the  request  to  the 
current  Master.  The  current  Master  sends  out  messages  to  the  Caretaker  and 
Sender  processes  which  mean,  in  English,  "Finish  up  what  you're  doing  and  exit". 
Then  it  sends  all  its  special  Master  data  (the  contents  of  the  Master  directory)  to 
the  Master-to-be,  who  now  takes  over  as  Master  (the  old  Master  sends  out  an 
announcement to all the Maildriver machines about the change). If the new
Master had previously been the Lieutenant, it must now, of course, appoint some
other machine Lieutenant and send over the Lieutenant information.


Part  III.  Robustitude 


10.  When  a  server  crashes 


10.1.  When  a  server  goes  down 

Eventually,  of  course,  machines  crash.  When  a  Maildriver  machine  crashes,  life 
goes  on.  Steps  are  taken  to  see  to  it  that  things  proceed  smoothly.  In  general, 
the  idea  is  to  keep  track  of  everything  which  happens  while  the  machine  is  dead, 
so  that  it  may  catch  up  when  it  comes  back  up. 

It is the job of the Caretaker to periodically poll the sundry Maildriver
machines to verify that they are still up and running. If it detects a down
machine, it places the name of that machine in a Dead Machine List (DML).
Then it creates a list for that machine, called the Dead Machine Catchup List
(DMCL), which will contain all actions which the dead machine will need to perform
to get caught up when it eventually comes back to life. It also notifies the
Master of the death. The Master does several things to deal with the death.
First, it changes its own file descriptor with which it normally communicates with
that machine; it changes it to communicate instead with the Caretaker. Thus, it
will be able to continue normal processing in a normal fashion: requests for the
dead machine are now simply re-routed to the Caretaker. Next, it sends word of
the death to the Sender process for the dead machine, which then proceeds to
send everything in its queue to the Caretaker. The Caretaker, then, receives all
requests for the dead machine, both initially from the Sender, and subsequently
from the Master, as normal processing continues. Finally, the Master, not
surprisingly, sends word of the death to the Lieutenant, which keeps a DML and
DMCLs of its own. Since the Lieutenant already receives special notification of
each request in the system as it happens, it does not need additional notification
for each dead machine request. When it gets general notification, it simply uses
its own DML to put things in DMCLs as appropriate.
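
The descriptor swap is what lets the Master carry on without special-casing dead machines. A minimal sketch, assuming a fixed machine count and using the hypothetical names Routes, routes_init, mark_dead, mark_alive and route_for (the arrays mirror the machines and save_machines variables of the Appendix):

```c
#include <assert.h>

#define N_MACHINES 4

typedef struct {
    int machines[N_MACHINES];       /* descriptor currently used per machine */
    int save_machines[N_MACHINES];  /* the normal descriptors, kept for rebirth */
    int caretaker;                  /* descriptor of the Caretaker */
} Routes;

void routes_init(Routes *r, const int fds[N_MACHINES], int caretaker_fd) {
    for (int i = 0; i < N_MACHINES; i++)
        r->machines[i] = r->save_machines[i] = fds[i];
    r->caretaker = caretaker_fd;
}

/* On a death, requests for the machine are re-routed to the Caretaker. */
void mark_dead(Routes *r, int machine) {
    assert(machine >= 0 && machine < N_MACHINES);
    r->machines[machine] = r->caretaker;
}

/* On a rebirth, the saved descriptor is restored. */
void mark_alive(Routes *r, int machine) {
    assert(machine >= 0 && machine < N_MACHINES);
    r->machines[machine] = r->save_machines[machine];
}

/* The sending code never needs to know whether the machine is alive. */
int route_for(const Routes *r, int machine) {
    assert(machine >= 0 && machine < N_MACHINES);
    return r->machines[machine];
}
```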


Note, of course, that if the machine that died was the Lieutenant, then the Master
must promote a new Lieutenant first thing before it can do anything else. It
may choose the next Lieutenant in any number of ways: for example it may pick
the machine with the lightest load; it might also resort to a built-in numbering
scheme. The choice of next Lieutenant is not too critical; what is critical is that
one be chosen and sent all relevant information (pending sends, pending deletes,
DMLs and DMCLs).

Normal  processing  now  continues  amongst  the  living  machines,  except  that  no 
further requests are sent to the dead machine; they instead end up at the
Caretaker, on an appropriate DMCL (and on the Lieutenant's duplicate DMCL). The
exception  is  for  delete  requests.  If  a  dead  machine  requires  a  delete  request,  the 
DMCL  is  first  scanned  to  see  if  it  contains  a  delivery  request  for  the  same  letter; 
if  so,  then  the  delivery  request  is  removed  and  the  delete  request  is  discarded. 


No  sense  delivering  a  message  which  will  immediately  be  deleted! 
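
The cancellation rule for a DMCL can be sketched as below. The names Dmcl, Request, ReqType and dmcl_add are hypothetical, and a fixed-size array stands in for whatever queue structure a real Caretaker would use.

```c
#include <assert.h>
#include <string.h>

#define DMCL_MAX 128
#define NAME_LEN 64

typedef enum { REQ_DELIVER, REQ_DELETE } ReqType;

typedef struct {
    ReqType type;
    char    filename[NAME_LEN];
} Request;

/* Hypothetical Dead Machine Catchup List for one machine. */
typedef struct {
    Request items[DMCL_MAX];
    int     count;
} Dmcl;

/* Append a request for a dead machine. A delete that matches a queued
   delivery cancels it: the delivery is removed and the delete is
   discarded. Returns 0 on success, -1 if the list is full. */
int dmcl_add(Dmcl *l, ReqType type, const char *filename) {
    assert(strlen(filename) < NAME_LEN);
    if (type == REQ_DELETE) {
        for (int i = 0; i < l->count; i++) {
            if (l->items[i].type == REQ_DELIVER &&
                strcmp(l->items[i].filename, filename) == 0) {
                memmove(&l->items[i], &l->items[i + 1],
                        (size_t)(l->count - i - 1) * sizeof(Request));
                l->count--;
                return 0;
            }
        }
    }
    if (l->count == DMCL_MAX) return -1;
    l->items[l->count].type = type;
    strcpy(l->items[l->count].filename, filename);
    l->count++;
    return 0;
}
```

A delete with no matching delivery is still queued, since the dead machine may hold a copy of the letter that was delivered before it crashed.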

Also, if a given request is for letter delivery, both the Caretaker and the Lieutenant
record only the filename of the letter in the DMCL, not the whole letter.
This is simply to save time. The Caretaker, however, can easily get the letter
anyway. It just makes a link into the Cemetery directory of the copy of the Protoletter
that exists in the Master directory. We are guaranteed that a copy exists
there since the Master is currently sending out requests to deliver that letter.


10.2.  When  a  server  comes  back  up 

When  a  dead  server  comes  back  up,  it  starts  out  by  cleaning  up  its  Workspace, 
getting  rid  of  useless  things,  and  finishing  up  what  it  can.  It  goes  through  all  the 
Protoletters it finds there; if the Protoletter's State flag is set to "bogus", it simply
deletes the Protoletter (this would be the case if the machine was in the middle
of receiving the Protoletter when it crashed). Otherwise, each Protoletter
accurately  shows  the  state  of  delivery  of  that  letter;  the  server  may  simply  pick 
up  delivery  where  it  left  off,  since  the  header  contains  a  list  of  who  still  needs  to 
receive  the  letter. 
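
A sketch of this cleanup pass, under the assumption that each Protoletter can be summarized by its State flag and a count of recipients still to be served. The Protoletter struct, the State enum and clean_workspace are hypothetical names; the real Workspace is a directory of files.

```c
#include <assert.h>

typedef enum { STATE_BOGUS, STATE_VALID } State;

/* Hypothetical in-memory view of one Protoletter in the Workspace. */
typedef struct {
    State state;      /* "bogus" until the letter has been fully received */
    int   remaining;  /* recipients in the header still awaiting delivery */
    int   present;    /* cleared once the Protoletter is deleted */
} Protoletter;

/* Discard half-received Protoletters and count those whose delivery
   should be picked up where it left off. */
int clean_workspace(Protoletter *ws, int n) {
    int to_resume = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (!ws[i].present)
            continue;
        if (ws[i].state == STATE_BOGUS)
            ws[i].present = 0;   /* crashed mid-receipt: delete it */
        else if (ws[i].remaining > 0)
            to_resume++;         /* header lists who still needs it */
    }
    return to_resume;
}
```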

When  it  has  cleaned  up  its  Workspace,  it  is  ready  to  get  connected  to  the  rest  of 
the  system  and  get  brought  up  to  date.  To  do  this,  it  broadcasts  a  message  to  all 
the Maildriver machines, a message which translates into English as "I'm here!
I'm here! So who's the Master these days, anyway?". Only the Caretaker sends
back  a  response,  identifying  itself  as  Caretaker  on  the  Master  machine.  The 
Caretaker  then  proceeds  to  provide  the  reborn  machine  with  requests,  which  it 
takes  from  the  appropriate  DMCL.  When  the  DMCL  has  been  emptied,  the 
Caretaker  removes  the  reborn  machine  from  the  Dead  Machine  List,  and  notifies 
the  Master.  The  Master  changes  its  file  descriptor  back  to  the  Server  for  the 
reborn  machine  and  sends  word  to  the  Lieutenant,  which  removes  that  machine 
from  its  DML  and  destroys  that  machine’s  DMCL.  The  machine  is  now  officially 
back  up. 

Note that the Caretaker's DMCL is updated each time a request is successfully
completed from it (that is, each time the reborn machine successfully completes
the request). Thus, if the reborn machine dies again before the DMCL is emptied,
the Caretaker still has an up-to-date list of what needs to be done to catch
up that machine. However, the Lieutenant is not informed upon successful
completion of each catch-up request. The likelihood of two machines, including the
Master, going down at once is rather slim; in that rare event, the reborn machine
will simply be issued a few redundant requests which will be essentially ignored
[again, writing a file which is already there is defined to be a noop, as is deleting
a letter which is not there]. This seems a reasonable tradeoff.
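
The noop rule is what makes redundant requests safe. One way to obtain it on a POSIX file system is sketched below; the function names write_letter and delete_letter are hypothetical, and the paper does not say how its servers actually store letters.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write a letter unless it is already there.
   Returns 1 if written, 0 if it already existed (noop), -1 on error. */
int write_letter(const char *path, const char *body, size_t len) {
    assert(path != NULL && body != NULL);
    /* O_EXCL makes creation fail with EEXIST if the file is present. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return errno == EEXIST ? 0 : -1;
    ssize_t n = write(fd, body, len);
    if (close(fd) < 0 || n != (ssize_t)len) {
        unlink(path);            /* do not leave a partial letter behind */
        return -1;
    }
    return 1;
}

/* Delete a letter if it is there.
   Returns 1 if deleted, 0 if it was not there (noop), -1 on error. */
int delete_letter(const char *path) {
    assert(path != NULL);
    if (unlink(path) == 0)
        return 1;
    return errno == ENOENT ? 0 : -1;
}
```

With both operations idempotent, replaying a catch-up request that the reborn machine had already completed leaves its storage unchanged.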

11.  When  the  Master  crashes 

Eventually,  the  fates  conspire  to  be  truly  nasty,  and  the  Master  will  crash. 

It is the job of the Lieutenant to poll the Master regularly to make sure it is still
up. If the Lieutenant finds that the Master is dead, drastic measures are in
order. The Lieutenant takes over and becomes Master. Fortunately, it has been
provided with sufficient information and is up to the task.

The first thing that the Lieutenant does is to appoint a new Lieutenant (see
section 10). It then writes out all its special information to appropriate files which
will  be  read  by  the  master  processes  once  they  get  started.  Finally,  it  execs  the 
Master  process. 

The  Master  process  now  starts  up  as  usual,  setting  up  connections  and  forking 
processes  off.  It  broadcasts  a  message  to  all  surviving  Maildriver  machines;  in 
English, this message would be akin to "The Master is dead! Long live the Master!!
Anyway, I'm the new master, so please send me any requests from now on,
including anything for which you are still waiting for acknowledgement from the
old  Master". 

After  everything  is  set  up,  it  reads  the  file  left  behind  for  it  by  the  Lieutenant. 
This  file  contains  all  requests  which  were  begun  but  uncompleted  when  the  old 
Master  died.  The  new  Master  reads  this  file  and  re-issues  each  of  these  requests 
from  scratch.  This  will  perhaps  cause  some  redundant  requests  to  go  out,  but 
again this will not harm anything (hey, nobody ever said that replacing the Master
was going to be easy!). As usual, a delete request for a partially delivered
letter  is  queued  until  delivery  is  complete.  If  a  request  is  a  delivery  request,  then 
the  Lieutenant  will  only  have  the  filename  in  his  records.  To  get  the  actual  file, 
it  first  searches  its  own  storage.  If  it  cannot  find  it  there  (i.e.  it  never  received 
its  copy),  it  contacts  the  original  recipient  machine  (again,  the  original  recipient 
machine  is  part  of  the  filename)  to  get  the  file.  If  the  original  recipient  machine 
is  not  available  (i.e.  is  dead),  it  broadcasts  a  desperate  request  for  any  machine 
that  has  the  file.  If  no  copy  can  be  found,  then  delivery  will  have  to  wait  until  a 
machine with a copy comes back to life. It should be emphasized that this would
be an extremely rare occurrence. In this case, it must put the filename in a special
list,  which  it  checks  against  every  time  a  machine  comes  back  to  life.  And,  of 
course,  it  notifies  the  new  Lieutenant  about  this. 
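
The search order for a missing letter file can be sketched as a simple fallback chain. The Cluster struct, the Source enum and find_file are hypothetical names, and two arrays of flags stand in for the network queries a real Master would make.

```c
#include <assert.h>

#define N_MACHINES 4

typedef enum { SRC_LOCAL, SRC_ORIGIN, SRC_BROADCAST, SRC_NONE } Source;

/* Hypothetical snapshot of the system, for illustration only. */
typedef struct {
    int alive[N_MACHINES];     /* 1 if the machine is up */
    int has_copy[N_MACHINES];  /* 1 if the machine holds the letter */
} Cluster;

/* self is the new Master; origin is the machine that originally received
   the letter (encoded in the filename). On return, *which is the machine
   to fetch from, or -1 if no living machine has a copy. */
Source find_file(const Cluster *c, int self, int origin, int *which) {
    assert(self >= 0 && self < N_MACHINES);
    assert(origin >= 0 && origin < N_MACHINES);
    if (c->has_copy[self]) {                        /* 1: own storage */
        *which = self;
        return SRC_LOCAL;
    }
    if (c->alive[origin] && c->has_copy[origin]) {  /* 2: original recipient */
        *which = origin;
        return SRC_ORIGIN;
    }
    for (int m = 0; m < N_MACHINES; m++) {          /* 3: ask everyone else */
        if (m != self && m != origin && c->alive[m] && c->has_copy[m]) {
            *which = m;
            return SRC_BROADCAST;
        }
    }
    *which = -1;                                    /* 4: wait for a rebirth */
    return SRC_NONE;
}
```

When SRC_NONE comes back, the caller would record the filename on the special list and retry each time a machine comes back to life.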

By  now,  there  is  a  new  Caretaker.  The  Caretaker  may  also  find  files  for  it  left 
behind  by  the  Lieutenant.  The  Caretaker  examines  the  Cemetery  directory, 
looking for files of the form "FromLieu.<machine>". Such a file contains the
DMCL  for  that  machine.  The  Caretaker  reads  these  files  in  and  builds  its  lists. 
It,  too,  must  go  through  a  desperate  search  for  any  sending  files  and  uses  the 
same  algorithm  as  the  Master  did.  When  it  has  finished  all  this,  it  then  simply 
proceeds  to  function  as  Caretaker. 

When the old Master comes back up, it goes through the normal restart procedures
(see section 10). It ceases to have any Master status and erases everything
in its Master and Cemetery directories (similarly, when an old Lieutenant
comes  back  up,  it  erases  its  Lieutenant  directory.  In  short,  whenever  a  machine 
comes  back  up,  it  erases  both  its  Master  and  Lieutenant  directories).  The  only 
way  for  that  machine  to  become  Master  again  is  for  an  administrator  to  issue  a 
"change  master"  request. 


12.  When  a  process  crashes 


The  Maildriver  system  consists  of  many  different  processes  interacting  together. 
It  is  possible  that  any  given  process  could  fail  for  a  number  of  reasons.  While 
Maildriver  contains  many  checks  and  balances  to  handle  machines  crashing,  to 
have  the  same  checking  on  a  per-process  basis  would  be  absurdly  expensive.  On 
the  surface,  this  would  appear  to  be  something  of  a  problem,  but  in  fact  it  is  not 
so.  To  see  this,  one  must  consider  the  reasons  that  a  process  might  die. 

Processes can die for any of the following reasons: 1) The process exits in a normal
fashion (i.e. a call to exit()), 2) An error occurs and the process receives some
sort of signal about it (such as SIGBUS, bus error), 3) The process is sent some
sort of signal from another process (such as SIGKILL, absolutely kill the process),
or 4) The operating system glitches badly and somehow loses or corrupts
critical information about the process.

Case  1  can  happen  on  occasion  under  controlled  circumstances,  in  which  case  the 
loss  of  the  process  is  expected  and  no  problem,  and  cannot  happen  otherwise 
because  of  the  way  the  code  is  written. 

Case 4 should not ever happen; if it does, it means that something is seriously
wrong  with  the  system  (i.e.  hardware),  and  it  is  reasonable  to  expect  the  system 
to  crash  imminently  anyway,  so  it  is  pointless  to  try  to  do  anything  about  it,  and 
in  any  case  Maildriver  already  handles  the  case  of  a  machine  crashing,  so  there 
is  no  problem. 

Cases  2  and  3  appear  more  interesting.  Assuming  that  the  system  has  already 
been  properly  written,  debugged  and  severely  tested,  neither  case  should  ever 
occur.  If  either  does  occur,  it  is  likely  (though  not  definite)  that  something  is 
seriously wrong with the system. It might be possible to take some sort of corrective
action, but in all likelihood human intervention is desirable, to find out just
what's wrong. In light of this likelihood, and in view of the fact that tracking
dead processes is expensive, the Maildriver solution, radical though it may
seem, is as follows: crash the system. Yes, you read that right. Crash the system.
First, though, a message should be written somewhere where a human will
find it, explaining the decision to crash. Actually, crashing the system is not as
radical as it sounds. Operating systems do it when they find bad inconsistencies;
Maildriver is just doing the same thing. And again, it is easily handled, since
there already exist safety valves for dead machines.
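
A rough sketch of how such a bailout might be wired up with POSIX signals follows. It is an assumption-laden illustration: emergency_bailout and install_bailout are hypothetical names (the Appendix only declares an EmergencyBailout routine without showing it), the message goes to standard error rather than to a log a human is sure to read, and the handler merely ends the current process, whereas the scheme described here would go on to halt every Maildriver process on the machine.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static const char bailout_msg[] =
    "maildriver: fatal signal in a process; halting this machine\n";

/* Only async-signal-safe calls (write, _exit) are made in the handler. */
static void emergency_bailout(int sig) {
    (void)sig;
    if (write(STDERR_FILENO, bailout_msg, sizeof bailout_msg - 1) < 0) {
        /* nothing more can be done at this point */
    }
    _exit(1);
}

/* Install the bailout handler for the error signals of case 2.
   Returns 0 on success, -1 if any handler could not be installed. */
int install_bailout(void) {
    static const int fatal[] = { SIGSEGV, SIGBUS, SIGILL, SIGFPE };
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = emergency_bailout;
    sigemptyset(&sa.sa_mask);
    for (size_t i = 0; i < sizeof fatal / sizeof fatal[0]; i++)
        if (sigaction(fatal[i], &sa, NULL) < 0)
            return -1;
    return 0;
}
```

SIGKILL cannot be caught, so a process killed that way (case 3) would have to be noticed by some other means, such as the parent process receiving SIGCHLD.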


13.  Summary  of  Robustitude 

Maildriver  is  clearly  very  robust.  Because  mail  is  replicated  to  all  Maildriver 
machines,  a  user  can  connect  to  any  machine  and  get  full  service.  In  particular, 
if a machine crashes while a user is using it, his client program can simply connect
to another machine and repeat the last request, with no loss of functionality.
Since every action taken by the system requires acknowledgement and unacknowledged
requests are repeated, one way or another, there can be no lost or partially
completed requests. If acknowledgement is given, then the request is
guaranteed to be safely at the state indicated by the acknowledgement. And no
damage can be done by redundantly repeated requests. Since a dead machine is
caught up to the rest of the world when it comes back on line, there is never any
inconsistency in the system. And since the Lieutenant watches over the Master,
no particular Maildriver machine is critically important; any can fail and life
will still go on. And if a process should fail, it brings the whole machine down
with it, to ensure no inconsistencies accrue.

In  short,  every  eventuality  is  covered.  A  user  need  never  be  aware  of  the  state 
of  the  system.  As  long  as  he  can  connect  to  a  live  machine,  he  has  full  services 
available  to  him. 


Part  IV.  Conclusion 

Maildriver  is  a  highly  reliable  system  for  computer  mail.  If  it  has  any  serious 
flaws,  they  are  likely  in  the  realm  of  time  efficiency.  Full  replication  is  expensive. 
Still, it is not clear just how badly and in what ways the system clogs up.
Nonetheless, a few possibilities for improvement suggest themselves and are
mentioned here, just as a starting point. In practice, one would have to experiment
and find just the right combination.

One clear potential problem is the overloading of the Master. It is not clear just
how much the Master machine will be burdened by its extra responsibilities. Hopefully
the extra burden will not be too bad. If it is, though, certain tasks could
potentially  be  delegated  to  other  machines,  at  varying  loss  of  reliability.  For 
example,  the  Lieutenant  could  be  put  in  charge  of,  say,  catching  up  a  machine 
which  has  been  down.  The  algorithm  would  remain  the  same,  with  the  roles  of 
Lieutenant  and  Master  switched. 

Another possibility which might improve performance would be a change in the
letter delivery algorithm. Instead of the recipient machine sending the Protoletter
to the Master, who then distributes it across the system, one could
instead have the recipient machine simply send a message to the Master saying
that it has received a new letter. The Master would then direct the other
machines to get the Protoletter directly from the recipient machine. While this
would result in a substantial lowering of the Master's overhead and distribute it
more evenly across the network, there is a price to be paid in terms of reliability.
Specifically, the recipient machine would have to acknowledge to its client before
a second copy of the Protoletter was made; this would leave open the possibility
that the recipient machine could crash after acknowledging but before actually
effecting replication. Thus, the acknowledgement to the client would in fact be a lie.
It follows that there are only two ways to avoid telling this lie. One would be to
delay acknowledgement until all machines have taken their copy; this is silly as it
results in more overhead than the original scheme, and keeps the user waiting to
boot. The other would be to wait until at least one other machine has gotten it.
The problem with this is that it creates additional overhead in terms of keeping track
of just who has what copies of what. The logical extension to fix this problem is
to not wait until any old machine gets it, but to wait until some specific machine
has it. This brings us back to where we started from, since this is essentially the
original Master scheme already presented. Given then that there is no way to
decrease the overhead without losing reliability (and credibility), it comes down
to playing tradeoffs. In general this change would seem an ill-advised move
unless the Master was severely overloaded and the change would remedy the
situation in a noticeable way.

It is also not clear just how long the "receive mail" function will take in terms of
retrieving partially delivered mail. It may be that the extra time renders it
not worth it for some/many/most users. It would be a very simple thing to add
a "quick receive mail" function which would just fetch the mail off the local
machine. At this writing, though, it is not clear that this is necessary.

In  any  case,  despite  its  possible  drawbacks,  Maildriver  seems  a  solid  system.  It 
clearly  appears  to  meet  its  design  goals  of  high  reliability,  full  access,  and  efficient 
bulk  mailing. 


And it certainly has a catchy name.


Note: we assume the following bug fixed:
fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | FASYNC)

works for sockets (and hence pipes), and that moreover, the signal handler which
processes the SIGIO receives the value of fd as the arg "code" (see fcntl(2), sigvec(2)).


MASTER 

This process is the master. It must set things
up initially, fork off Sender and Caretaker
processes, and appoint a Lieutenant. Then it
proceeds to perform the normal duties of
coordinating Maildriver.


#include <signal.h>, etc.
#define DONT_CARE (-1)

int lieutenant, caretaker, *machines, *save_machines;  /* file descriptors */

something *ActiveMachineList;  /* list of active machines */
extern void EmergencyBailout();

struct { char *filename;
         int machine;
       } *DoYouHaveIt;  /* list of files we couldn't find */


HaveLetterDeleted(message) whatever message;
{   char *filename = get filename from message,
         *user = get user who wants deletion from message;
    int machine;
    FILE *pending = create <filename>.deleting in Master directory;

    for (each machine)
    {   send a deletion request to machine, including
            filename and user;
        write machine to pending;
    }
    fclose(pending);
}

/* ---------------------------------------------------------------- */

SendLetterToMachine(message, machine, fp) whatever message; int machine; FILE *fp;
{   if (machine == original server)
        send "mark valid and process" to it
    else
        if (machines[machine] == caretaker)
            send filename of letter to it
        else
            send entire letter to it;
    get "I got it" from machine;
    write machine to fp;
}


SendOffLetter(message, first_machine) whatever message; int first_machine;
{   int machine;
    char *filename = filename in message;
    FILE *pending = create <filename>.sending in Master directory;

    if (first_machine != DONT_CARE)
        SendLetterToMachine(message, first_machine, pending);
    for (each machine)
        if (machine != first_machine)
            SendLetterToMachine(message, machine, pending);
    fclose(pending);
}

/* ---------------------------------------------------------------- */

ProcessLifeOrDeath(message) whatever message;
{   int machine = the machine mentioned in the message;

    if (machine has died)
    {   send message to machines[machine] about death;
        machines[machine] = caretaker;
        if (machine == the Lieutenant)
        {   lieutenant = pick & connect to some random lieutenant;
            inform it that it is now lieutenant;
            for (each .sending & .deleting file in
                 the Master directory)
                send the send/delete request to lieutenant;
            send it the ActiveMachineList;
        }
    }
    else
        if (machine has come back to life)
        {   send message to machines[machine] about rebirth;
            machines[reborn machine] = save_machines[reborn machine];
            for (each entry in the DoYouHaveIt list)
                if ((machine == (othermachine = machine in entry)) &&
                    ((ask machine if it has file in entry) == FOUND))
                {   read file from machine into Master directory;
                    remove entry from DoYouHaveIt list;
                    send the file to othermachine;
                }
        }
    inform lieutenant of birth or death;
}

/* ---------------------------------------------------------------- */

ProcessRequest(message) whatever message;
{   int machine = requesting machine from message;

    switch (type of request)
    {   SendLetter:     read letter from server and
                            put it into the Master directory;
                        send filename of letter to Lieutenant;
                        await "I got it" from Lieutenant;
                        send "I got it" to server;
                        put filename of letter on delivery queue;
                        break;

        DeleteLetters:  send request to Lieutenant;
                        await "Got it" from Lieutenant;
                        send "Got it" to server;
                        check for letter in Master directory;
                        if (it's there)
                            put message on delete queue;
                        else
                            HaveLetterDeleted(message);
                        break;

        ReceiveMail:    recipient = get recipient from message;
                        for (each letter in delivery queue)
                            if (recipient will receive letter)
                                SendOffLetter(message, machine);
                        break;

        AddUser:        send message to Lieutenant about addition;
                        await "Got it" from Lieutenant;
                        fp = creat add.<user>;
                        for (each machine)
                        {   send add request to that machine;
                            write machine to fp;
                        }
                        fclose(fp);
                        break;

        DeleteUser:     send message to Lieutenant about deletion;
                        await "Got it" from Lieutenant;
                        fp = creat del.<user>;
                        for (each machine)
                        {   send delete request to that machine;
                            write machine to fp;
                        }
                        fclose(fp);
                        break;

        ChangeMaster:   send message to new master about change;
                        send all Master files & queues to new
                            Master;
                        for (each machine)
                            send message about change of master;
                        send message to caretaker about change;
                        send message to lieutenant about change;
                        exit(0);
    }
}

/* ---------------------------------------------------------------- */

ProcessAcknowledgement(message) whatever message;
{   char *filename = get filename of letter from message;

    switch (type of acknowledgement)
    {   delivered letter:   open <filename>.sending in Master directory;
                            delete acknowledging machine from it;
                            if (<filename>.sending is now empty)
                            {   send "forget it" to Lieutenant;
                                await "I got it" from Lieutenant;
                                delete <filename>.sending;
                                delete <filename> from Master dir;
                                if (filename is on delete queue)
                                {   remove it from queue;
                                    HaveLetterDeleted(filename);
                                }
                            }
                            break;

        finished deletion:  open <filename>.deleting in Master directory;
                            delete acknowledging machine from it;
                            if (<filename>.deleting is now empty)
                            {   send "forget it" to Lieutenant;
                                await "I got it" from Lieutenant;
                                delete <filename>.deleting;
                                delete <filename> from Master dir;
                            }
                            break;

        finished add user:  open add.<user> in Master directory;
                            delete acknowledging machine from it;
                            if (add.<user> is now empty)
                            {   send "forget it" to Lieutenant;
                                await "I got it" from Lieutenant;
                                delete add.<user>;
                            }
                            break;

        finished del. user: open del.<user> in Master directory;
                            delete acknowledging machine from it;
                            if (del.<user> is now empty)
                            {   send "forget it" to Lieutenant;
                                await "I got it" from Lieutenant;
                                delete del.<user>;
                            }
                            break;
    }
}

/* ---------------------------------------------------------------- */

void MessageHandler(sig, fd) int sig, fd;
{   read message from fd;
    if (fd == caretaker)
        ProcessLifeOrDeath(message);
    else
        if (message is new request)
            ProcessRequest(message);
        else
            if (message is an acknowledgement)
                ProcessAcknowledgement(message);
    send "Got it." to fd;
}

/* ---------------------------------------------------------------- */

StartProcesses()
{   int *for_caretaker = calloc(# of active machines, sizeof(int)),
        machine;

    machines = calloc(# of active machines, sizeof(int));

    /* first, set up all the pipes we're gonna need */
    for (machine = 0; machine < # of active machines; machine++)
    {   get two pipes;
        put one into for_caretaker[machine];
        put the other into
            both machines[machine] and save_machines[machine];
        call fcntl to set them up to be asynchronous;
    }
    caretaker = yet another pipe;

    for (machine = 0; machine < # of active machines; machine++)
    {   fork off a new process;
        exec sender with args:
            machine, machines[machine], for_caretaker[machine];
    }
    fork off another process;
    exec Caretaker with for_caretaker array as args;

    for (machine = 0; machine < # of active machines; machine++)
        close(for_caretaker[machine]);
}

/* ---------------------------------------------------------------- */

/* on a change of master, we only have a filename, we
   need the actual file to send!  desperately try to
   find the file */

FindFile(message) whatever message;
{   char *filename = filename contained in message;

    if ((scan each user InBox for filename) == FOUND)
        link(found file, Cemetery);
    else
        if (machine contained in filename /* machine which originally
                received the letter */ is not on DeadMachineList)
        {   send request for that file to that machine;
            read letter from machine into Cemetery;
        }
        else
            if ((ask every other machine if they have it) == FOUND)
                read letter from that machine into Cemetery;
            else
                put the filename and the machine that needs it
                    on the DoYouHaveIt list;
}

/* ---------------------------------------------------------------- */

main(argc, argv) int argc; char **argv;
{   signal(SIGSEGV etc, EmergencyBailout);
    if (argv indicates this is a fresh startup)
    {   read a startup file to find which machine is initial lieutenant;
        lieutenant = get socket connection to lieutenant;
    }
    else  /* we are replacing a dead master */
        socket = whatever argv says the descriptor is;

    read a startup file to build ActiveMachineList;
    StartProcesses();
    signal(SIGIO, MessageHandler);

    if (we are replacing a dead master)
    {   read FromLieutenant file in Master directory
            and re-send all the requests found therein;
        if request is send letter
            FindFile(request);
        broadcast "I'm the new Master" to all machines;
    }

    for (;;)
    {   sigpause(0);
        while (delivery queue is not empty)
        {   SendOffLetter(first thing on queue, DONT_CARE);
            remove first thing from queue;
        }
    }
}

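/**********************************************************\

   CARETAKER

   The Caretaker is in charge of the cemetery. In
   particular, it is responsible for knowing which
   machines are dead or alive. When a dead server
   machine comes back to life, the Caretaker is
   responsible for bringing it up to date.

\**********************************************************/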


#include  <signal.h>,  etc. 

#define NO_PULSE ??   /* NO_PULSE should have a numeric value indicating
                         how much time to wait while polling a machine
                         before deciding it is dead (i.e. has no pulse).
                         Infinitely tweakable */

#define TIME_BETWEEN_POLLS ??   /* same kind of deal */

int *machines, *pulse, master;

something *ActiveMachineList;   /* list of active machines */

struct {  char *filename;
          int machine;
       }  *DoYouHaveIt;         /* list of files we couldn't find */

Queue *DMCL;                    /* array of Dead Machine Catchup Lists */

extern void EmergencyBailout();
int FinishUpAndDie = 0;


GetPostMortemRequest(message)
whatever message;

{  int machine = get machine from message;

   if (request in message is for letter deletion)
      for (each request in DMCL[machine] queue)
         if (request in queue is to send the letter
             that we now want to delete)
         {  remove the send request from the queue;
            remove the link in the Cemetary;
            send back "I got it" to master;
            return;
         }

   if (request is for letter sending)
   {  get filename from message;
      link(filename in Master directory, same name in Cemetary);
   }

   add request to DMCL[machine];
   send back "I got it" to master;
}
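
The cancellation rule in GetPostMortemRequest (a deletion request removes a queued send request for the same letter instead of being queued itself) can be sketched with a fixed-size array standing in for one machine's catch-up list. Everything named here (struct catchup_list, catchup_add, the size limits) is a hypothetical illustration, not part of Maildriver, and the sketch leaves out the Cemetary link handling and the reply to the master.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_REQUESTS 64   /* assumed capacity of one catch-up list */
#define NAME_LEN     64   /* assumed maximum filename length */

enum req_kind { REQ_SEND, REQ_DELETE };

struct request {
    enum req_kind kind;
    char filename[NAME_LEN];
};

struct catchup_list {
    struct request items[MAX_REQUESTS];
    int count;
};

/* Add a request to a dead machine's catch-up list.
   Returns 1 if the request was queued, 0 if a delete cancelled a
   queued send for the same file, -1 if the list is full. */
int catchup_add(struct catchup_list *list, enum req_kind kind,
                const char *filename)
{
    int i;

    assert(list != NULL && filename != NULL);

    if (kind == REQ_DELETE) {
        for (i = 0; i < list->count; i++) {
            if (list->items[i].kind == REQ_SEND &&
                strcmp(list->items[i].filename, filename) == 0) {
                /* Close the gap left by the cancelled send request. */
                memmove(&list->items[i], &list->items[i + 1],
                        (size_t)(list->count - i - 1) * sizeof list->items[0]);
                list->count--;
                return 0;
            }
        }
    }

    if (list->count == MAX_REQUESTS)
        return -1;

    list->items[list->count].kind = kind;
    strncpy(list->items[list->count].filename, filename, NAME_LEN - 1);
    list->items[list->count].filename[NAME_LEN - 1] = '\0';
    list->count++;
    return 1;
}
```

A delete for a letter that was never queued for sending is still appended, matching the fall-through to "add request to DMCL[machine]" in the listing.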

/* - - - */

SendPreNatalRequest(message)
whatever message;

{  int machine = resurrecting machine (from message);

   if (DMCL[machine] is not empty)
   {  send first thing on DMCL[machine] to machines[machine];

      if (first thing is send letter request)
         send letter over (using link in Cemetary);
      remove first thing from DMCL[machine];
   }
   else
   {  send "That's all folks!" to machine;
      add machine to active machine list;
      send "<machine> is back on line" to master;
   }
}

/* - - - */

ResurrectMachine(message)
whatever message;

{  int machine = get machine from message;

   send "Hi there. I'm the master" to machine;

   for (each entry in the DoYouHaveIt list)
      if ((machine == (othermachine = machine in entry)) &&
          ((ask machine if it has file in entry) == FOUND))
      {  read file from machine into Cemetary;
         remove entry from DoYouHaveIt list;
         if (othermachine is dead)
            put send request in DMCL[othermachine];
         else
            send the file to othermachine;
      }

   SendPreNatalRequest(message);
}

/* - - - */


void AnswerThePhone(sig, fd)
int sig, fd;

{  read message from fd;

   if (message is change-of-master notification)
      FinishUpAndDie = 1;
   else
   if (fd == master)
      GetPostMortemRequest(message);
   else
   if (message is "Hey, I'm here!" from ex-dead machine)
      ResurrectMachine(message);
   else
   if (message is "Ok, got it. What next?")
      SendPreNatalRequest(message);
}

/* - */ 

/* on a change of master, we only have a filename; we
   need the actual file to send! desperately try to
   find the file */

FindFile(message)
whatever message;

{  char *filename = filename contained in message;

   if ((scan each user InBox for filename) == FOUND)
      link(found file, Cemetary);
   else
   if (machine contained in filename /* machine which originally
                                        received the letter */
       is not on DeadMachineList)
   {  send request for that file to that machine;
      read letter from machine into Cemetary;
   }
   else
   if ((ask every other machine if they have it) == FOUND)
      read letter from that machine into Cemetary;
   else
      put the filename and the machine that needs it
      on the DoYouHaveIt list;
}


main(argc, argv)
int argc;
char **argv;

{  int machine;

   signal(SIGSEGV etc, EmergencyBailout);

   parse argv to build an array ("machines") of descriptors for
   communicating with the sending processes for each machine;

   /* the descriptors are already open -- we have inherited them --
      we just need to figure out which is which */

   master = whatever descriptor argv says goes to the master;

   initialize DoYouHaveIt list;

   read a startup file to build ActiveMachineList;

   request = allocate space for a request queue per machine;

   for (each machine on list)
   {  pulse[machine] = set up a socket connection (*synchronous*)
                       to that machine's pulse process;
      request[machine] = initialize queue for that machine;
   }

   for (each file in the Cemetary of the form FromLieu.<machine>)
   {  put machine in the DeadMachineList;

      read the file into DMCL[machine]; while doing so,
      check for and remove send/delete pairs;
      if (the request is a send request)
         FindFile(request);
      delete the file;
   }

   signal(SIGIO, AnswerThePhone);

   for (;;)
   {  for (each active machine)
      {  send "Are you there?" to pulse[machine];

         attempt to read "Yes I'm here" from pulse[machine],
         but time out after NO_PULSE time;

         if (timed out)
         {  remove machine from active list;
            send "<machine> has died" to master;
         }
      }

      sleep(TIME_BETWEEN_POLLS);

      if (FinishUpAndDie)
         for (;;)
         {  if (all the request queues are empty)
               exit(0);
            sleep(TIME_BETWEEN_POLLS);
         }
   }
}
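
The step "attempt to read ... but time out after NO_PULSE time" in the Caretaker's polling loop could be realized with select(). The following is a minimal sketch assuming POSIX select; the function name has_pulse and the choice of milliseconds as the timeout unit are hypothetical, since the listing leaves NO_PULSE unspecified.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
   Returns 1 if a reply is waiting, 0 if the wait timed out (no pulse),
   -1 on error (including interruption by a signal). */
int has_pulse(int fd, int timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE && timeout_ms >= 0);

    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &readable, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}
```

Because the Caretaker also receives SIGIO while polling, a real caller would have to decide whether an interrupted wait (the -1 case) counts as a missed pulse or should simply be retried; the listing does not say.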


/**********************************************************\

   SENDER

   This process accepts requests from the Master
   or Caretaker and sends them to its target
   machine (specified in "argv").

\**********************************************************/

#include <signal.h>, etc.

/* - - - */

int target, master, caretaker;   /* file descriptors */
struct q_record *Queue;          /* message queue */

extern void EmergencyBailout();
int MachineAlive = 1, FinishUpAndDie = 0;

/* - */ 

InputHandler(sig, fd)
int sig, fd;

{  read message from fd;

   if (fd == target)
   {  verify that this is an acknowledgement meant for
      someone else (that is, Master or Caretaker);

      /* this includes "I am here!" for Caretaker */
      send message to the someone else;
   }
   else
   if (message is a death notification)
      MachineAlive = 0;
   else
   if (message is a rebirth notification)
      MachineAlive = 1;
   else
   if (message is change-of-master notification)
      FinishUpAndDie = 1;
   else
   {  append message onto Queue;
      send back "I got it" over fd;
   }
}

/* - - - */

/* note: things can get added to the Queue (via interrupts) while
   this function runs, except during the critical section */

ProcessOutputQueue()

{  while (the Queue is non-empty)
   {  get first message on Queue;
      send it to target;

      while (MachineAlive)   /* could change via interrupt */
      {  attempt to read acknowledgement from target, but
         without blocking;
         if (acknowledgement received)
            break;
      }

      if (!MachineAlive)
         goto DIED;

      sigblock(sigmask(SIGIO));   /* begin critical section */
      remove this message from Queue;
      sigsetmask(0);              /* end critical section */
   }

   return;

DIED:
   send entire Queue to caretaker;
}

/* - - - */ 

main(argc, argv)
int argc;
char **argv;

{  signal(SIGSEGV etc, EmergencyBailout);

   sscanf(argv[2], "%d", &master);
   sscanf(argv[3], "%d", &caretaker);

   target = set up and open connection to target machine (argv[1]);

   /* arrange for connection to be asynchronous */
   /* as we will also be handling acknowledgements */
   /* back to the Master or Caretaker */
   fcntl(target, F_SETFL, fcntl(target, F_GETFL) | FASYNC);

   /* we have already inherited asynchronous pipes to the Master and
      Caretaker. Install handler to talk to them */
   signal(SIGIO, InputHandler);

   /* now hang out and process interrupts as they come in */
   for (;;)
   {  sigpause(0);

      ProcessOutputQueue();
      if (FinishUpAndDie)


/**********************************************************\

   LIEUTENANT

   The Lieutenant's job is to monitor the doings
   of the Master, so as to be able to take over
   if the Master dies. It is also the Master's
   poller and initiates the takeover if the
   Master shows no pulse.

\**********************************************************/

#include <signal.h>, etc.

#define NO_PULSE ??   /* NO_PULSE should have a numeric value indicating
                         how much time to wait while polling a machine
                         before deciding it is dead (i.e. has no pulse).
                         Infinitely tweakable */

#define TIME_BETWEEN_POLLS ??   /* same kind of deal */

/* - - - */

something *DeadMachineList;   /* list of dead machines */

Queue pending, *DMCL;         /* DMCL = Dead Machine Catchup List */

/* - */ 

void MessageHandler(sig, fd)
int sig, fd;

{  whatever message = read message from fd;

   switch(message)
   {  "sending letter":
      "deleting letter":
      "adding user":
      "deleting user":   put request on pending queue;
                         for (each machine on DeadMachineList)
                            put request on DMCL[machine];

      "letter sent":
      "letter deleted":
      "user added":
      "user deleted":    remove request from pending queue;

      "new master":      connect to new master;

      "dead machine":    put dead machine on DeadMachineList;

      "live machine":    delete DMCL[machine] queue;
                         remove machine from DeadMachineList;
   }

   send "I got it" to master;
}
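
The Lieutenant's bookkeeping in MessageHandler pairs each request with its later completion notice: "sending letter" is queued until "letter sent" arrives, and so on. A minimal sketch of that pairing follows. The integer request id, the enum names, and pending_update are all hypothetical; the listing does not say how a completion notice is matched to its request.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 64   /* assumed capacity of the pending queue */

/* The four requests come first, followed by their completion notices
   in the same order, so a notice maps to its request by subtraction. */
enum msg_type {
    SENDING_LETTER, DELETING_LETTER, ADDING_USER, DELETING_USER,
    LETTER_SENT, LETTER_DELETED, USER_ADDED, USER_DELETED
};

struct pending_entry { enum msg_type type; int id; };
struct pending_queue { struct pending_entry items[MAX_PENDING]; int count; };

static int is_request(enum msg_type t) { return t <= DELETING_USER; }

static enum msg_type request_for(enum msg_type notice)
{
    return (enum msg_type)(notice - LETTER_SENT);
}

/* Record a request, or drop the request that a completion notice answers.
   Returns the number of requests still pending, or -1 if the queue is full. */
int pending_update(struct pending_queue *q, enum msg_type type, int id)
{
    int i, j;

    assert(q != NULL);

    if (is_request(type)) {
        if (q->count == MAX_PENDING)
            return -1;
        q->items[q->count].type = type;
        q->items[q->count].id = id;
        q->count++;
        return q->count;
    }

    for (i = 0; i < q->count; i++) {
        if (q->items[i].type == request_for(type) && q->items[i].id == id) {
            /* Shift the tail down so the remaining requests keep their order. */
            for (j = i + 1; j < q->count; j++)
                q->items[j - 1] = q->items[j];
            q->count--;
            break;
        }
    }
    return q->count;
}
```

Keeping the remaining entries in arrival order matters here because, on takeover, the pending queue is written to the FromLieutenant file and the new Master re-sends the requests it finds there.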


BecomeMaster()

{  somehow pick a machine to be next lieutenant;

   set up connection to that machine (and inform him
   that he is now lieutenant);

   write the pending queue to the file FromLieutenant
   in the Master directory;

   for (each machine in the DeadMachineList)
      write a file FromLieu.<machine> in the Cemetary
      containing the contents of DMCL[machine];

   exec Master process (passing fd of new Lieutenant, and the fact
   that this is a replacement, not a fresh start);
}


/* - - - */

main(argc, argv)
int argc;
char **argv;

{  int machine;

   signal(SIGSEGV etc, EmergencyBailout);

   initialize empty DeadMachineList;

   request = allocate space for a request queue per machine;
   for (each machine on list)
      request[machine] = initialize queue for that machine;

   master = set up socket connection to master;

   /* arrange for connection to be asynchronous */
   fcntl(master, F_SETFL, fcntl(master, F_GETFL) | FASYNC);

   pulse = set up socket connection to pulse process on master;

   signal(SIGIO, MessageHandler);

   for (;;)
   {  send "Are you there?" to pulse;

      attempt to read "Yes I'm here" from pulse,
      but time out after NO_PULSE time;

      if (timed out)
         BecomeMaster();

      sleep(TIME_BETWEEN_POLLS);
   }
}


